Poster: pmocek Date: Oct 4, 2013 9:10am
Forum: opensource Subject: Re: Derive failed on 450MB batch of PDFs

Jeff, can you recommend any documentation of best practices for publishing a large set like this to Internet Archive? I'm trying to help crowdsource review of these federal contracts with war/intel contractor Booz Allen Hamilton, and I was hoping to take advantage of your OCRing and torrenting.

Poster: Jeff Kaplan Date: Oct 4, 2013 10:08am
Forum: opensource Subject: Re: Derive failed on 450MB batch of PDFs

i'd suggest starting a new item page for the pdfs in each directory. identifiers like this would make sense to me:
2013FOIABoozAllenFedContracts-air_force
2013FOIABoozAllenFedContracts-dept_of_agriculture_ag_research_service
2013FOIABoozAllenFedContracts-dept_of_defense_education_activity
2013FOIABoozAllenFedContracts-dept_of_transportation
2013FOIABoozAllenFedContracts-federal_aviation_administration
2013FOIABoozAllenFedContracts-federal_energy_regulatory_commission
2013FOIABoozAllenFedContracts-food_and_drug_administration
2013FOIABoozAllenFedContracts-national_cancer_institute
2013FOIABoozAllenFedContracts-patent_and_trademark_office
2013FOIABoozAllenFedContracts-united_states_postal_service
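For anyone who wants to generate identifiers in this pattern rather than type them out, here is a minimal sketch. The directory names in it are hypothetical stand-ins (the real folder names were not posted in this thread), and the slug rule (lowercase, with runs of non-alphanumeric characters collapsed to underscores) is only an assumption that happens to reproduce the style of the list above.

```python
import re

# Shared prefix taken from the identifiers suggested above.
PREFIX = "2013FOIABoozAllenFedContracts"


def identifier_for(directory_name: str, prefix: str = PREFIX) -> str:
    """Build an item identifier from a directory name.

    Assumed rule: lowercase the name and collapse every run of
    characters other than a-z and 0-9 into a single underscore.
    """
    slug = re.sub(r"[^a-z0-9]+", "_", directory_name.lower()).strip("_")
    return f"{prefix}-{slug}"


# Hypothetical directory names, for illustration only.
directories = ["Air Force", "Dept of Transportation", "United States Postal Service"]

identifiers = [identifier_for(name) for name in directories]
for identifier in identifiers:
    print(identifier)
```

Each printed identifier would then be used as the name of one new item page, with that directory's pdfs uploaded to it.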

use archive.org/upload with Chrome, Firefox or Safari and just drag the pdf files into the gray box on the start page. do not try to upload a directory. it won't work.

When you're done let me know and i can remove https://archive.org/details/2013FOIABoozAllenHamiltonFederalContracts

hope this helps.

Poster: Jeff Kaplan Date: Oct 4, 2013 1:57pm
Forum: opensource Subject: Re: Derive failed on 450MB batch of PDFs

until you have a collection, i can only suggest putting, in the description field of each item, either links to all of the other items or a search query that brings them all up.
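A rough sketch of how such a search link could be built. It assumes the items share the identifier prefix suggested earlier in this thread, and that a wildcard query of the form identifier:PREFIX-* against archive.org/search.php matches all of them; both the query syntax and the URL shape are assumptions to check against the live site before pasting the link into a description.

```python
from urllib.parse import urlencode

# Shared prefix from the identifiers suggested earlier in this thread.
PREFIX = "2013FOIABoozAllenFedContracts"


def search_url(prefix: str = PREFIX) -> str:
    """Return a search URL meant to list every item sharing the prefix.

    Assumption: a trailing * acts as a wildcard in the identifier field.
    """
    query = f"identifier:{prefix}-*"
    # urlencode percent-encodes the ":" and "*" so the link is safe to paste.
    return "https://archive.org/search.php?" + urlencode({"query": query})


link = search_url()
print(link)
```

The same link can go in every item's description, so it does not need updating when more items are added under the prefix.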

the item has been removed.

This post was modified by Jeff Kaplan on 2013-10-04 20:57:31